Abstract: In this paper I propose B-Rank, an efficient
ranking algorithm for recommender systems. B-Rank is
based on a random walk model on hypergraphs. Depending
on the setup, B-Rank outperforms other state-of-the-art
algorithms in terms of precision and recall (by roughly 19% to 50%)
and inter-list diversity (by roughly 20% to 60%). B-Rank captures
the difference between popular and niche objects well. The
proposed algorithm produces very promising results for
both sparse and dense voting matrices. Furthermore, I introduce
a recommendation list update algorithm to cope with new
votes. This technique significantly reduces computational
complexity. The algorithm is simple to implement, since
B-Rank needs no parameter tuning.

I. Introduction 

One of the most amazing trends of today's globalized 
economy is peer production [1]. An unprecedented mass 
of unpaid workers is contributing to the growth of the 
World Wide Web: some build entire pages, some only 
drop casual comments, having no other reward than 
reputation [2]. Many successful web sites (e.g. Blogger 
and MySpace) are just platforms holding user-generated 
content. The information thus conveyed is particularly 
valuable because it contains personal opinions, with no 
specific corporate interest. It is, at the same time, very 
hard to go through it and judge its degree of reliability. If 
you want to use it, you need to filter this information, 
select what is relevant and aggregate it; you need to 
reduce the information overload [3]. 

As a matter of fact, opinion filtering has become rather
common on the web. There exist search engines (e.g.
Google News) that are able to extract news from journals,
web sites (e.g. Digg) that harvest them from blogs, and
platforms (e.g. Epinions) that collect and aggregate votes
on products. The basic version of these systems ranks the
objects once and for all, assuming they have an intrinsic value,
independent of the personal taste of the demander [4].
They lack personalization [5], which constitutes the new
frontier of online services.

Users need only browse the web in order to leave
recorded traces; any comments they drop add to them.
The more information you release, the better
the service you receive. Personal information can, in
fact, be exploited by recommender systems. The deal
becomes, at the same time, beneficial to the community,
as every piece of information can potentially improve the
filtering procedures. Amazon.com, for instance, uses one's
purchase history to provide individual suggestions. If you
have bought a physics book, Amazon recommends
other physics books to you: this is called item-based or
content-based recommendation [6], [7]. Many different techniques
have been developed in the past, including collaborative
filtering methods [6], [8]-[10], content-based techniques
[11]-[14], spectral methods [15]-[17] and network-based
algorithms [18]-[20].

The evaluation of recommender systems is difficult
[21]. There are several reasons for this: a) an algorithm
may perform well on a particular data set but fail on
others; b) in the past, evaluations focused on the predictive
accuracy of withheld ratings, while novelty and diversity were
mostly ignored, although these two factors play a pivotal role
from a user's point of view [22]; c) there is still no common
framework in the community defining a set of evaluation
metrics. Such a framework would be particularly useful
when comparing different techniques based on different
data sets.

In this paper I propose B-Rank, a novel top-N recommendation
algorithm. B-Rank is based on a Markov chain
model [23] on hypergraphs [24]. The algorithm achieves
high precision and recall while maintaining high
diversity between different recommendation lists.
B-Rank is parameter-free, which is very
attractive from an implementation point of view. The
performance is measured on two complementary data
sets, MovieLens and Jester. I compare B-Rank to ZLZ-II
[20], [25], which is known to be superior to ordinary
collaborative filtering methods in the investigated setup.
GRank [20], a global ranking method, serves as a baseline
benchmark.

II. Methods 

User ratings are stored in a matrix $V$ of size $O \times U$, where $O$
denotes the number of objects and $U$ the number of users.
The entry $v_{\alpha i} \in V$ corresponds to user $i$'s rating of
object $\alpha$. Throughout this paper objects are labeled by Greek
letters, whereas people are identified by Latin letters.

A. B-Rank 

B-Rank is based on a random walk model with given
initial conditions taking place on a hypergraph $\mathcal{G}$ ¹.
A transition matrix $P$ is associated with the hypergraph
$\mathcal{G}$; $P_{\alpha\beta} \in P$ denotes the transition probability from
object $\alpha$ to object $\beta$. $x_{(i)}$ is a normalized column vector
representing user $i$'s preference, i.e. the collection of
already voted objects: $x_{(i)\beta} = 1 / \sum_{\alpha} \mathrm{sign}(v_{\alpha i})$ for every
object $\beta$ user $i$ has voted on, where
$\mathrm{sign}(x) = 1$ if $x > 0$, and 0 otherwise.

¹ A hypergraph $\mathcal{G}(V, E)$ is a finite set $V$ of vertices together with
a finite multiset $E$ of hyperedges, which are arbitrary subsets of
$V$. The incidence matrix $H$ of a hypergraph $\mathcal{G}(V, E)$ with
$E = \{e_1, e_2, \dots, e_m\}$ and $V = \{v_1, v_2, \dots, v_n\}$ is the $m \times n$ matrix
with $h_{ij} = 1$ if $v_j \in e_i$, and 0 otherwise.



In the hypergraph framework, each user $i$ is modeled
as a hyperedge and each object $\alpha$ is a hypergraph vertex.
I define the transition matrix $P$ of $\mathcal{G}$ as:

$$P_{\alpha\beta} = (1 - \delta_{\alpha\beta}) \frac{\sum_i w_i h_{\alpha i} h_{\beta i}}{\sum_{\gamma \neq \alpha} \sum_i w_i h_{\alpha i} h_{\gamma i}}, \qquad (1)$$

where the denominator normalizes row $\alpha$, $\delta_{\alpha\beta}$ is the Kronecker
delta and $w_i$ is the weight associated with hyperedge $i$.
In matrix formulation $P$ reads $P = D^{-1} A$ with
the symmetric adjacency matrix $A = H W H^T - T$.
$D$ is a diagonal matrix containing the row sums of $A$,
$d_\alpha = \sum_\beta a_{\alpha\beta}$. $H$ is the incidence matrix and $H^T$ its
transpose. $W$ is the diagonal hyperedge weight matrix
and $T$ is the diagonal vertex degree matrix with
$T_{\alpha\beta} = \delta_{\alpha\beta} \sum_i w_i h_{\alpha i} h_{\beta i}$. $P$ is a row-stochastic matrix with zero
diagonal by construction.
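
The matrix formulation above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the author's implementation: the function name `transition_matrix` and the small example matrix are hypothetical, and `H` is taken to be the O x U object-by-user incidence matrix with a 1 wherever a user voted for an object.

```python
import numpy as np

def transition_matrix(H, w=None):
    """Return P = D^{-1} A with A = H W H^T - T (zero diagonal)."""
    H = np.asarray(H, dtype=float)
    if w is None:
        w = np.ones(H.shape[1])          # unweighted case: w_i = 1 for all users
    A = (H * w) @ H.T                    # H W H^T with W = diag(w)
    np.fill_diagonal(A, 0.0)             # subtracting T removes the diagonal
    d = A.sum(axis=1)                    # row sums d_alpha
    P = np.zeros_like(A)
    connected = d > 0                    # objects sharing no user keep a zero row
    P[connected] = A[connected] / d[connected, None]
    return P

# Hypothetical example: 3 objects (rows), 2 users (columns).
H = [[1, 1],
     [1, 0],
     [0, 1]]
P = transition_matrix(H)
```

Objects that share no user with any other object have no outgoing paths; the sketch leaves their rows at zero, which is a design choice of this example rather than something the text prescribes.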

B-Rank calculates user $i$'s recommendation list $f_{(i)}$ as
follows:

1) Forward propagation: $f^{F}_{(i)} = P^T x_{(i)}$

2) Backward propagation: $f^{B}_{(i)} = P x_{(i)}$

3) Final ranking: $f_{(i)} = f^{F}_{(i)} \# f^{B}_{(i)}$ ²

4) Set already voted objects to zero.

5) Sort $f_{(i)}$ in descending order.

6) Select the top N items of the sorted list $f_{(i)}$.
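
The six steps can be sketched as follows. This is a minimal illustration, assuming a precomputed row-stochastic matrix `P` and a list `voted` with the indices of the objects the user has already voted on; the function name `b_rank` and the example matrix are hypothetical.

```python
import numpy as np

def b_rank(P, voted, N):
    """Top-N list for one user; `voted` holds indices of already voted objects."""
    P = np.asarray(P, dtype=float)
    x = np.zeros(P.shape[0])
    x[voted] = 1.0 / len(voted)            # normalized preference vector x_(i)
    f_forward = P.T @ x                    # step 1: forward propagation
    f_backward = P @ x                     # step 2: backward propagation
    f = f_forward * f_backward             # step 3: element-wise product
    f[voted] = 0.0                         # step 4: zero out voted objects
    order = np.argsort(-f, kind="stable")  # step 5: descending, ties by index
    return order[:N].tolist(), f           # step 6: keep the top N

# Hypothetical 3-object transition matrix; the user voted on object index 1.
P = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
top, scores = b_rank(P, voted=[1], N=1)
```

Breaking ties by object index is an arbitrary choice made here; the text does not specify a tie-breaking rule.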

In this paper I focus on the unweighted version of B-Rank,
i.e. I set $w_i = 1\ \forall i$. This allows
straightforward comparisons with similar algorithms. The
more complex case $w_i \neq 1$ will be discussed in
a follow-up paper. Different aspects of the algorithm,
computational issues and ranking stability questions are
outlined in Sect. VI-A and Sect. VI-B, respectively.

B. ZLZ-II 

ZLZ-II [20], [25] is based on a lazy random walk
process taking place on a bipartite user-object graph.
ZLZ-II uses a coarse-grained version of the original voting
matrix: $a_{\alpha i} = 1$ if $v_{\alpha i} > v_{tr}$, and 0 otherwise. $v_{\alpha i}$ is the
original vote and $a_{\alpha i}$ is the transformed vote used in
ZLZ-II. $v_{tr}$ is a threshold to be selected. In general
only "positive" votes are kept and the rest is discarded.
Objects are assigned an initial "resource" $f$. The
given resources $f$ are redistributed according to the linear
transformation $\tilde{f}_i = W f_{(i)}$. The resulting $\tilde{f}_i$ is user $i$'s
recommendation list. As in B-Rank, this vector is sorted
in descending order and the top N objects are presented
as recommendations. $W$ is a column-stochastic matrix:

$$W_{\alpha\beta} = \frac{1}{k_\beta} \sum_j \frac{a_{\alpha j} a_{\beta j}}{k_j},$$

where $k_\beta$ is the number of votes given to object $\beta$ and $k_j$ is
the number of objects voted by user $j$. Note that the
diagonal of $W$ is non-zero (lazy random walk).
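
As a rough sketch of the redistribution matrix, the snippet below coarse-grains a vote matrix with a threshold and builds a column-stochastic $W$ from the object degrees $k_\beta$ and user degrees $k_j$. It assumes the standard mass-diffusion form of $W$ from the network-based literature cited as [20], [25]; the function name, the example votes and the threshold value are hypothetical.

```python
import numpy as np

def zlz_weight_matrix(a):
    """W[alpha, beta] = (1 / k_beta) * sum_j a[alpha, j] * a[beta, j] / k_j."""
    a = np.asarray(a, dtype=float)       # O x U binary matrix of kept votes
    k_obj = a.sum(axis=1)                # k_beta: votes received by each object
    k_user = a.sum(axis=0)               # k_j: objects voted by each user
    k_obj[k_obj == 0] = 1.0              # guard against division by zero
    k_user[k_user == 0] = 1.0
    return (a / k_user) @ a.T / k_obj    # last division scales column beta

# Hypothetical votes (3 objects x 2 users) and threshold v_tr = 2.
V = np.array([[5, 4],
              [3, 1],
              [0, 5]])
a = (V > 2).astype(float)                # coarse graining: keep "positive" votes
W = zlz_weight_matrix(a)
```

Each column of `W` sums to one whenever the corresponding object has received at least one kept vote, and the diagonal is positive, matching the lazy random walk remark.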



² # denotes the element-wise multiplication of two vectors.



C. Collaborative Filtering 

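Collaborative Filtering (CollabF) is perhaps the most
popular recommendation method [26]. It is based on user-user
linear correlations as a similarity measure:

$$v'_{i\beta} = \langle v_i \rangle + \frac{\sum_j S_{ij} (v_{j\beta} - \langle v_j \rangle)}{\sum_j |S_{ij}|},$$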

where $v'_{i\beta}$ is the predicted vote, $\langle v_i \rangle$ the average vote
expressed by user $i$ and $S$ is the similarity matrix. In
this paper I use a common correlation measure (Pearson
correlation [27]) to calculate $S$:

$$S_{ij} = \frac{\sum_\alpha (v_{i\alpha} - \langle v_i \rangle)(v_{j\alpha} - \langle v_j \rangle)}{\sqrt{\sum_\alpha (v_{i\alpha} - \langle v_i \rangle)^2}\, \sqrt{\sum_\alpha (v_{j\alpha} - \langle v_j \rangle)^2}},$$

with $S_{ij} = 0$ if users $i$ and $j$ haven't judged more
than one item in common. User $j$'s recommendation list
is generated by stacking $v'_{j\beta}$ in a vector, sorting the
elements in descending order and following the procedure
described in Sect. II-A.
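
A small sketch of the similarity computation may help. It assumes that missing votes are encoded as NaN, that the sums run over co-rated items, and that each user's average is taken over all of that user's votes; these conventions and the function name `pearson_similarity` are assumptions of the example, not details given in the text.

```python
import numpy as np

def pearson_similarity(v_i, v_j):
    """Pearson correlation of two users over co-rated items (NaN = no vote)."""
    v_i = np.asarray(v_i, dtype=float)
    v_j = np.asarray(v_j, dtype=float)
    common = ~np.isnan(v_i) & ~np.isnan(v_j)
    if common.sum() <= 1:
        return 0.0                       # S_ij = 0 with at most one common item
    d_i = v_i[common] - np.nanmean(v_i)  # deviation from user i's average vote
    d_j = v_j[common] - np.nanmean(v_j)
    denom = np.sqrt((d_i ** 2).sum() * (d_j ** 2).sum())
    return float((d_i * d_j).sum() / denom) if denom > 0 else 0.0

nan = float("nan")
s_same = pearson_similarity([1, 2, 3, nan], [2, 4, 6, nan])   # identical taste
s_none = pearson_similarity([1, nan, nan], [2, 3, nan])       # one common item
```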

D. GRank 

GRank is a global ranking scheme. Objects are ranked
according to their popularity (number of votes) $k_\alpha$. Unlike
B-Rank, CollabF and ZLZ-II, GRank does not take
users' individual tastes into account, since it generates the same
recommendation list for every user participating in the
system. The ranking list is given by sorting the objects
according to their popularity in descending order.
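
For comparison, the popularity baseline reduces to a single sort. The sketch below assumes an O x U binary vote matrix `H`; the function name `g_rank` and the example data are hypothetical.

```python
import numpy as np

def g_rank(H, N):
    """Indices of the N most voted objects; H is the O x U binary vote matrix."""
    k = np.asarray(H).sum(axis=1)                      # k_alpha: votes per object
    return np.argsort(-k, kind="stable")[:N].tolist()  # same list for every user

# Hypothetical vote matrix: 5 objects (rows), 4 users (columns).
H = [[1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 1, 1, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]
top3 = g_rank(H, 3)
```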

E. Data set 

I used two data sets to test B-Rank. The first is MovieLens
(movielens.umn.edu), a web service from GroupLens
(grouplens.org). Ratings are recorded on a five-star
scale. The data set contains 1682 movies x 943 users.
Only 6.5% of possible votes are expressed. The second is Jester
(shadow.ieor.berkeley.edu/humor), an online joke recommender
system. The data set contains 73421 users x 100
jokes. In contrast to MovieLens, the Jester data set
is dense: 75% of all votes are expressed. Ratings
are real numbers between -10 and 10.

Apart from the sparsity and the dimensional ratio (number
of objects vs. number of users), the most fundamental
difference is the amount of a priori information accessible
to users. People choose movies they want to see on the
basis of many different information sources. They know
the actors, they read reviews, they ask friends for feedback,
etc. When users buy their tickets, they have already made a
pre-selection. On the other hand, no pre-selection is possible
with online jokes.

In this sense, the two data sets are complementary. Tests 
on diverse data sets are more meaningful in general [21]. 
For a discussion on different performance aspects, see 
[17]. 



F. Accuracy evaluation 

To test the algorithms I divided the data into two disjoint
sets, a training set $S_{tr}$ and a test set $S_{ts}$. The training set
is used to predict the missing votes contained in the test set.

I implemented four different evaluation metrics: recall,
precision, F1 and diversity. The last is adopted from [20].
Recall for user $i$ is defined as the number of recovered
items $d_i$ in the top N places of the recommendation list,
divided by the number of items $D_i$ in the test set for that
user, thus $PR_i = d_i(N)/D_i$. Averaging over all users
gives the final score for recall, $PR$. Precision measures the
number of recovered items in the top N places divided
by the length of the recommendation list N. For user $i$
we have $PP_i = d_i(N)/N$. The overall precision $PP$ is
obtained by averaging over all $PP_i$.

Increasing N (the length of the recommendation list) usually
increases recall and decreases precision at the same
time. To balance out these effects, it is common to use the
F1 metric, the harmonic mean of recall and precision:
$F1 = 2\, PR\, PP / (PR + PP)$.
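
The three accuracy metrics can be sketched for a single user as follows. The function name and the example lists are hypothetical, and note one simplification: the paper averages $PR_i$ and $PP_i$ over all users before forming F1, whereas this sketch evaluates one user only.

```python
def recall_precision_f1(recommended, test_items, N):
    """Per-user recall PR_i and precision PP_i for a top-N list, plus their F1."""
    hits = len(set(recommended[:N]) & set(test_items))   # d_i(N)
    recall = hits / len(test_items) if test_items else 0.0
    precision = hits / N
    total = recall + precision
    f1 = 2 * recall * precision / total if total > 0 else 0.0
    return recall, precision, f1

# Hypothetical user: 4 recommended objects, 3 objects hidden in the test set.
recall, precision, f1 = recall_precision_f1([3, 7, 1, 9], {7, 9, 4}, N=4)
```

Here two of the three hidden objects are recovered, so recall is 2/3, precision is 1/2, and F1 lies between the two.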

To test the diversity between different recommendation
lists I use $h(N)$, a metric proposed in [20]. The metric
measures the diversity in the top N places of two different
recommendation lists: $h_{ij}(N) = 1 - q_{ij}(N)/N$, where
$q_{ij}(N)$ denotes the number of common items in the top N
places of lists $i$ and $j$. $h_{ij} = 1$ means there are no common
items in the two lists, whereas $h_{ij} = 0$ means a complete
match. Averaging over all $h_{ij}(N)$ gives the population
personalization level $h(N)$.
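
A direct, unoptimized sketch of this metric averages $h_{ij}(N)$ over all pairs of lists. The function name `diversity` and the example lists are hypothetical; for large user populations one would likely sample pairs instead of enumerating them all.

```python
from itertools import combinations

def diversity(lists, N):
    """Mean of h_ij(N) = 1 - q_ij(N) / N over all pairs of top-N lists."""
    tops = [set(items[:N]) for items in lists]
    pairs = list(combinations(tops, 2))
    return sum(1 - len(a & b) / N for a, b in pairs) / len(pairs)

# Hypothetical top-3 lists of three users.
h = diversity([[1, 2, 3], [1, 2, 4], [5, 6, 7]], N=3)
```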

Each experiment was done on 20 different instances,
i.e. different splittings into training and test sets with a
fixed ratio (number of votes in the test set vs. number
of votes in the training set). Final scores for all metrics
were obtained by averaging over all instance results. All
methods (B-Rank, ZLZ-II, CollabF, GRank) were tested on the
same instances to make a fair comparison.





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.21     0.19     0.11      0.03
PP      0.24     0.22     0.18      0.04
F1      0.22     0.20     0.12      0.03    10%
h       0.84     0.71     0.74      0.22    18%

TABLE II

MOVIELENS: N = 10, TEST SET = 20% OF EXPRESSED VOTES





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.29     0.27     0.12      0.05
PP      0.33     0.32     0.32      0.08
F1      0.31     0.29     0.15      0.06    7%
h       0.77     0.64     0.74      0.15    20%

TABLE III

MOVIELENS: N = 20, TEST SET = 70% OF EXPRESSED VOTES





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.17     0.15     0.07      0.02
PP      0.43     0.40     0.37      0.11
F1      0.24     0.21     0.11      0.03    14%
h       0.80     0.64     0.78      0.15    25%

TABLE IV

MOVIELENS: N = 10, TEST SET = 70% OF EXPRESSED VOTES





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.86     0.72     0.66      0.48
PP      0.38     0.31     0.33      0.23
F1      0.53     0.43     0.41      0.31    23%
h       0.71     0.65     0.66      0.52    9%

TABLE V

JESTER: N = 20, TEST SET = 20% OF EXPRESSED VOTES





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.72     0.50     0.49      0.20
PP      0.59     0.41     0.46      0.20
F1      0.64     0.45     0.44      0.20    42%
h       0.80     0.72     0.75      0.52    11%

TABLE VI

JESTER: N = 10, TEST SET = 20% OF EXPRESSED VOTES



III. Results 

The main results are collected in Tables I to VIII. Bold
figures indicate the best result for a given evaluation metric.
The length of the recommendation list N was set to
N = 20 and N = 10 for all experiments. The performance
improvement is measured relative to the ZLZ-II
algorithm. There is a tendency toward higher improvements
for shorter recommendation lists (N = 10). The best
improvements are achieved for diversity between different
recommendation lists.
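
The text does not spell out the formula behind the "Improvement" column. Assuming it is the relative gain of B-Rank over ZLZ-II, the short check below reproduces the percentages reported for the F1 and h rows of Table I; the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(b_rank_score, zlz_score):
    """Gain of B-Rank over ZLZ-II in percent (assumed definition)."""
    return 100.0 * (b_rank_score - zlz_score) / zlz_score

# Table I (MovieLens, N = 20, test set = 20% of expressed votes).
f1_gain = round(relative_improvement(0.25, 0.21))   # F1 row
h_gain = round(relative_improvement(0.81, 0.68))    # h row
print(f1_gain, h_gain)
```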





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.34     0.30     0.18      0.05
PP      0.20     0.17     0.15      0.04
F1      0.25     0.21     0.14      0.04    19%
h       0.81     0.68     0.70      0.19    19%

TABLE I

MOVIELENS: N = 20, TEST SET = 20% OF EXPRESSED VOTES





        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.66     0.53     0.44      0.35
PP      0.72     0.53     0.69      0.35
F1      0.68     0.53     0.51      0.35    28%
h       0.69     0.44     0.52      0.32    57%

TABLE VII

JESTER: N = 20, TEST SET = 70% OF EXPRESSED VOTES




        B-Rank   ZLZ-II   CollabF   GRank   Improvement
PR      0.49     0.31     0.25      0.13
PP      0.90     0.68     0.75      0.35
F1      0.63     0.42     0.36      0.19    50%
h       0.80     0.50     0.63      0.33    60%

TABLE VIII

JESTER: N = 10, TEST SET = 70% OF EXPRESSED VOTES



IV. Discussion 

The results show a significant performance improvement in
all experiments. B-Rank is able to perform well on
complementary data sets. However, as in every experiment
with recommender systems, the results are always 'bound'
to the data sets used. There is no guarantee of obtaining similar
results for different data.

The best improvement, compared to ZLZ-II, is achieved
for inter-list diversity. This result highlights the fact that
B-Rank can cope with users' individual tastes. From real-world
experiments we know that higher diversity is, in general,
positively correlated with user satisfaction [22]. However,
user satisfaction is hard to measure in off-line experiments
and user feedback is needed to draw robust conclusions.

An extension to the ZLZ-II algorithm was proposed
in [28], where the authors reached a diversity performance
comparable to that of B-Rank on the MovieLens
data set. Their method includes a tuning parameter $\lambda$.
B-Rank, in contrast, is parameter-free and therefore easier to
implement and maintain.

Extensions to B-Rank may improve the results
further. One extension of the presented basic B-Rank
algorithm is a non-constant weight matrix $W$. This will
be discussed in a follow-up paper. Another extension is to
take into account n-step propagation (indirect connections
between two objects $\alpha$ and $\beta$). In tests with different n > 1,
however, recall, precision and inter-list
diversity all dropped significantly. One explanation for this
behavior is a propagation reinforcement of popular items.

The basic version of B-Rank can be extended by
introducing a user-dependent parameter $\eta(i)$, controlling
the contribution of backward and forward propagation:
$f_{(i)} = (f^{F}_{(i)})^{\eta(i)} \# (f^{B}_{(i)})^{1-\eta(i)}$. Such a parameter is a
fine tuning of user $i$'s preferences for popular and niche
objects. A user-independent, global $\eta$ is also possible. All
these extensions increase computational complexity, since
the system has to learn the 'correct' parameters.

Extensions and non trivial weight matrices W will be 
investigated and presented in a follow up paper. 

V. Summary 

In this paper I proposed B-Rank, a new top-N recommendation
algorithm. The algorithm is based on a
random walk model on hypergraphs. B-Rank is easy to
implement and needs no parameter tuning. The algorithm
outperforms other state-of-the-art methods such as ZLZ-II [25]
and Collaborative Filtering in terms of accuracy and
inter-list diversity.

B-Rank is able to find interesting 'blockbusters' as well as
niche objects. The algorithm is very promising
for different applications, since it produces good results
for both sparse and dense voting matrices. Furthermore,
I introduced a simple recommendation list update
algorithm, which reduces computational complexity
dramatically (Sect. VI-B).




VI. Additional material 

A. General remarks on B-Rank 

To highlight various aspects of B-Rank, I introduce
a toy network (Fig. 1). For simplicity all links between
objects and users are equally weighted, $w_i = 1\ \forall i$.



Fig. 1. Toy net to illustrate B-Rank. Circles represent hyperedges 
(users), squares are hypervertices, i.e. objects. The votes are illustrated 
as links between objects and users. 



First, I discuss some general aspects; second, I show
how all aspects are well captured by the B-Rank algorithm.

Case A: huge audience in common: Intuitively, two
objects $\alpha$ and $\beta$ are similar to each other when they
share many users, i.e. they have many hyperedges in
common. Let's assume objects $\alpha$ and $\beta$ share many
users and user $i$ voted for $\alpha$ but has not voted for $\beta$ yet.
Then it is reasonable to recommend $\beta$ to user $i$. Such
a recommendation strategy clearly favors "blockbusters",
objects rated by almost every user in the community (e.g.
objects 1 and 3 in the toy network of Fig. 1).

Case B: exclusive audience: Look at object 5 in the 
toy network: this object is exclusively rated by user 4. 
Moreover, object 4 and object 5 share only user 4 and 
object 4 was not rated by many other users. In this sense, 
object 4 and 5 have an exclusive audience in common. 
It is reasonable to mark these objects as very similar and 
to recommend one of them to users who have not rated 
both. 

Do the random walk: I define a path $(\alpha \to \beta)$ as
an ordered triple $\{\alpha, i, \beta\}$ with $\alpha \neq \beta$ (i.e. object, user,
object). The transition probability $P_{\alpha\beta}$ in Eq. (1) counts
the number of paths (triples) starting at $\alpha$ and ending
at $\beta$, divided by the number of all paths starting at $\alpha$.
Examples: for $P_{12}$ we count 6 paths starting at object 1,
two of them ending at object 2, thus $P_{12} = 1/3$. For
$P_{13}$ we again count 6 paths starting at object 1, three of them
ending at object 3, thus $P_{13} = 1/2$. Note that $P_{\alpha\beta} \neq P_{\beta\alpha}$
in general, and $P_{\alpha\alpha} = 0\ \forall \alpha$.
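
The path-counting rule can be checked with a few lines of plain Python. The figure itself is not reproduced in this text, so the user-to-object assignment below is an assumption: it is one topology with four users and five objects that is consistent with every number quoted in this section. The names `users`, `count_paths` and `P` are hypothetical.

```python
from fractions import Fraction

# Assumed toy topology: user -> set of objects the user voted on.
users = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {1, 3, 4}, 4: {4, 5}}

def count_paths(alpha, beta=None):
    """Count triples {alpha, i, b} with b != alpha (and b == beta if given)."""
    return sum(1
               for objects in users.values() if alpha in objects
               for b in objects
               if b != alpha and (beta is None or b == beta))

def P(alpha, beta):
    """Transition probability as a ratio of path counts."""
    return Fraction(count_paths(alpha, beta), count_paths(alpha))

paths_from_1 = count_paths(1)
p12, p13, p21 = P(1, 2), P(1, 3), P(2, 1)
```

With this topology six paths start at object 1, two of them end at object 2 and three at object 3, and $P_{21}$ differs from $P_{12}$, illustrating the asymmetry noted above.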

Put everything together: To demonstrate the effect
of forward and backward propagation in B-Rank we
use a basic preference vector $x = [0, 0, 0, 1, 0]^T$ and
the topology of the toy net in Fig. 1. For the forward
propagation $f^F = P^T x$ we get:

$$f^F = [1/3, 0, 1/3, 0, 1/3]^T$$

The obtained figures for objects $\alpha \neq 4$ indicate the
probability for a random walker starting at object 4 and
landing at $\alpha \neq 4$. Note that the scores are the same for objects
1, 3 and 5. Object 2 obtains no score, because there is no
simple path from object 4 to object 2. Object 4 obtains
no score since the path $\{4, i, 4\}$ is not a valid path by
definition. For the backward propagation $f^B = Px$ we
get:

$$f^B = [1/6, 0, 1/6, 0, 1]^T$$



The backward propagation vector contains the probabilities
for a random walker starting at objects $\alpha \neq 4$ and
landing at object 4. We observe the same score for object
1 and object 3, but a much higher score for object 5, since
the probability for a random walker starting at object 5
and ending at object 4 is much higher than the probability
of reaching object 4 from another node.

The final score $f$ is given by the element-wise
multiplication $f^F \# f^B$. Thus

$$f = [1/18, 0, 1/18, 0, 1/3]^T$$

The final score for each object $\alpha \neq 4$ has a simple
interpretation: it is the probability for a random walker
starting at object 4, visiting object $\alpha$ and coming back to
object 4.

The higher score of object 5 makes sense in the
given setup, because objects 4 and 5 share an exclusive
audience; furthermore, object 4 is only 'loosely' connected
to all other objects.
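
The worked example can be reproduced numerically. Since the figure is not available in this text, the incidence matrix below is an assumption: a 5-object, 4-user topology chosen to be consistent with the path counts and vectors quoted above.

```python
import numpy as np

# Assumed incidence matrix (rows: objects 1..5, columns: users 1..4).
H = np.array([[1, 1, 1, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]], dtype=float)

A = H @ H.T
np.fill_diagonal(A, 0.0)                  # remove paths {alpha, i, alpha}
P = A / A.sum(axis=1, keepdims=True)      # row-stochastic transition matrix

x = np.array([0, 0, 0, 1, 0], dtype=float)  # the user voted on object 4 only
f_forward = P.T @ x                       # walker starts at object 4
f_backward = P @ x                        # walker ends at object 4
f = f_forward * f_backward                # element-wise product
```

Under this assumed topology the three vectors match the values given in the text, with object 5 receiving the highest final score.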

B-Rank captures the possible configurations described
in cases A and B well. If an object $\alpha$ has many links
and shares most of them with another object $\beta$, then $\beta$
is reached with higher probability than other objects that are less
connected (in number of paths) to $\alpha$. On the other hand, if
an object $\alpha$ has many connections but shares some
hyperedges (users) exclusively with an object $\beta$, then $f^F$ may
give a low resource to $\beta$, but $f^B$ will give a high score to the
same object $\beta$. In summary: B-Rank takes into account
the propagation of popular and niche objects alike.

Introducing hyperedge weights, described in Sec. II-A,
is a generalization of the procedure described in
this appendix. It is not clear what weight function is an
appropriate choice. This issue will be investigated in a
follow-up paper.



B. Computational issues 

I focus on two different computational aspects: 1) I
show how to make an efficient 'real-time' recommendation
without performing the matrix-vector multiplications
needed by B-Rank and 2) I give an update algorithm for
the transition matrix, $P \to P_{new}$, avoiding the matrix-matrix
multiplication otherwise needed to calculate $P_{new}$. Note: the matrix-matrix
multiplication is needed to calculate the adjacency matrix
$A$ for the hypergraph.

1) Offline-Online tasks in B-Rank: To calculate user
$i$'s recommendation list $f_{(i)}$, one has to perform two
matrix-vector multiplications (steps 1 and 2 described
in Sec. II-A) and an element-wise multiplication of two
vectors. We can reduce the effort to compute the matrix-vector
multiplications. The idea is simple: calculate object-specific
basis representations $b^F_\alpha$ and $b^B_\alpha$ for the forward
and backward propagation vectors, independent of all
users. The recommendation task (online) for a user $i$ is
then reduced to calculating a linear combination of the basis
forward and backward propagation vectors.



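a) Offline: The basis representation vectors $b^F_\alpha$ and $b^B_\alpha$
are defined as follows:

$$b^F_\alpha = P^T e_\alpha, \qquad b^B_\alpha = P e_\alpha.$$

$e_\alpha \in \mathbb{R}^O$ is a natural basis vector, where the dimension
$O$ is given by the number of objects.

b) Online: The forward and backward propagation
vectors for user $i$ are then given by:

$$f^F_{(i)} = \sum_\alpha c_\alpha b^F_\alpha, \qquad f^B_{(i)} = \sum_\alpha c_\alpha b^B_\alpha,$$

with $c_\alpha = x_{(i)\alpha}$. The final calculation of user $i$'s ranking
list $f_{(i)}$ is given by step 3, described in Sec. II-A.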



Note: using this shortcut produces different figures in the
recommendation lists compared to the ones generated
by the procedure in Sec. II-A. However, the ranking
(ordered list) will be the same.

The online part is easily done, since the calculation
essentially reduces to calculating a linear combination of
rows and columns of the transition matrix $P$.
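
The offline/online split can be sketched as below. The function names and the 5-object, 4-user incidence matrix are hypothetical and serve only as an illustration. With coefficients equal to the normalized preference vector the linear combination reproduces the direct matrix-vector products; rescaling the coefficients by a positive constant changes the scores but not their order.

```python
import numpy as np

def offline_basis(P):
    """Offline: stack b^F_alpha = P^T e_alpha and b^B_alpha = P e_alpha as columns."""
    E = np.eye(P.shape[0])               # natural basis vectors e_alpha
    return P.T @ E, P @ E

def online_scores(b_forward, b_backward, c):
    """Online: combine the basis vectors with coefficients c_alpha, then step 3."""
    return (b_forward @ c) * (b_backward @ c)

# Hypothetical incidence matrix (rows: objects, columns: users).
H = np.array([[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 0],
              [0, 0, 1, 1], [0, 0, 0, 1]], dtype=float)
A = H @ H.T
np.fill_diagonal(A, 0.0)
P = A / A.sum(axis=1, keepdims=True)

x = np.array([0.5, 0.0, 0.0, 0.5, 0.0])  # user voted on objects 1 and 4
b_forward, b_backward = offline_basis(P)
direct = (P.T @ x) * (P @ x)
online = online_scores(b_forward, b_backward, x)
rescaled = online_scores(b_forward, b_backward, 2.0 * x)  # unnormalized c
```

Because the coefficient vector is sparse for most users, the online step only needs the few rows and columns of `P` that belong to the objects the user has voted on.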

2) Update algorithm: The main effort to calculate the
transition matrix $P$ consists of a matrix-matrix multiplication
to compute the adjacency matrix $A$ of the hypergraph.
A naive way to maintain the system would be a
recalculation $P \to P_{new}$ every time a user rates an
object. I give a simple update algorithm for $P \to P_{new}$.
I write the transition matrix $P$ as:

$$P(H) = [G(H)]^{-1} F(H), \quad F(H) = H H^T - (x H^T)^D, \quad G(H) = (y F(H))^D.$$

The superscript $D$ denotes a diagonal matrix. $H$ is the
incidence matrix defined in Sec. II-A. $x$ and $y$ are row
vectors of appropriate format containing all ones. Then
$xW$ is a vector containing the column sums of a matrix
$W$. I define the updated matrix $P_{new}(H)$ as:

$$P_{new}(H) = P(H) + \Delta P(H) = [G(H) + \Delta G(H)]^{-1} [F(H) + \Delta F(H)].$$

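$\Delta X(H)$ denotes the change in $X$ when changing $H$. For
$\Delta G(H)$ and $\Delta F(H)$ we have:

$$\Delta G(H) = [y\, \Delta F(H)]^D,$$

$$\Delta F(H) = [(\Delta H) H^T] + [H (\Delta H)^T] + [(\Delta H)(\Delta H)^T] - [x (\Delta H)^T]^D.$$

For a single new vote the last two terms of $\Delta F(H)$ cancel.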



a) Single vote manipulation: I further investigate the
update algorithm in the case of one additional vote in the
incidence matrix $H$. To model a one-vote change in $H$,
I define the single-entry matrix $J^{ij} \in \mathbb{R}^{n \times m}$, which is
zero everywhere except in the $(i, j)$th entry, which is 1.
Assume a matrix $A$ $(n \times m)$ and a matrix $J^{ij}$ $(m \times p)$,
then

$$A J^{ij} = [0\ \ 0\ \cdots\ A_i\ \cdots\ 0\ \ 0]$$

is an $n \times p$ matrix with the $i$th column of $A$ in place of the
$j$th column. Conversely, assume $A$ $(n \times m)$ and $J^{ij}$ $(p \times n)$.
Then $J^{ij} A$ is a $p \times m$ matrix with the $j$th row of $A$ in the
place of the $i$th row. These operations amount to selecting and
repositioning a single column or row of a matrix.

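For a single vote change, I set $\Delta H = J^{\alpha i}$ $(O \times U)$.
For $P_{new}(H)$ we have:

$$P_{new}(H) = \left[ G(H) + \left( y \left( J^{\alpha i} H^T + (J^{\alpha i} H^T)^T \right) \right)^D \right]^{-1} \left[ F(H) + J^{\alpha i} H^T + (J^{\alpha i} H^T)^T \right].$$

This update is very efficient: $\mathcal{O}(o)$ at most instead of $\mathcal{O}(u^2 o)$.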
b) Many vote manipulation: The generalization of 
single vote manipulations is straightforward, since a many 
vote update is represented by a combination of single vote 
updates. 
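
The idea behind the single-vote update can be sketched without forming any matrix product: a new vote of user $i$ on object $\alpha$ only adds paths between $\alpha$ and the objects user $i$ has already voted on, so one row and one column of the adjacency matrix change. The code below is an illustration under that reading, not a transcription of the formulas above; the function names and the example matrix are hypothetical.

```python
import numpy as np

def adjacency(H):
    """Full recomputation: A = H H^T with the diagonal removed."""
    A = H @ H.T
    np.fill_diagonal(A, 0.0)
    return A

def add_vote(H, A, alpha, i):
    """Update H and A in place for a new vote of user i on object alpha."""
    if H[alpha, i] == 0:                 # ignore a vote that is already recorded
        co_rated = H[:, i].copy()        # objects user i has voted on so far
        A[alpha, :] += co_rated          # new paths alpha -> beta through user i
        A[:, alpha] += co_rated          # and the symmetric paths beta -> alpha
        H[alpha, i] = 1.0
    return H, A

def row_normalize(A):
    """P = D^{-1} A; rows without any path stay zero."""
    d = A.sum(axis=1, keepdims=True)
    return np.divide(A, d, out=np.zeros_like(A), where=d > 0)

# Hypothetical incidence matrix (rows: objects, columns: users).
H = np.array([[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 0],
              [0, 0, 1, 1], [0, 0, 0, 1]], dtype=float)
A = adjacency(H)
H_new, A_new = add_vote(H.copy(), A.copy(), alpha=1, i=3)
P_new = row_normalize(A_new)
```

The adjacency update touches a number of entries proportional to the number of objects. For brevity the sketch renormalizes all rows afterwards; only the row of $\alpha$ and the rows of the co-rated objects actually change, so a production version would renormalize just those.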